WHAT IS CLAIMED IS:

1. A method for managing a textual database, comprising 
the steps of: 

transcribing textual data into corresponding semantic 
units; 

storing the textual data in the textual database; and

generating an index based on semantic units for
indexing the stored textual data with the corresponding
semantic units.

2. The method of claim 1, wherein the semantic units 
comprise syllables. 

3. The method of claim 1, wherein the semantic units 
comprise morphemes. 

4. The method of claim 1, wherein the textual data is 
associated with audio data, and wherein the step of indexing 
further comprises indexing the audio data with the semantic 
units.

5. The method of claim 1, wherein the step of 
transcribing comprises the step of time-stamping the 
semantic units.

6. The method of claim 1, wherein the step of
transcribing comprises decoding the textual data with a 
recognition system utilizing a language model based on the 
semantic units. 

7. The method of claim 1, wherein the step of 
transcribing is performed using semantic-unit based
stenography.

8. The method of claim 1, wherein the step of 
generating an index comprises generating a hierarchical 
index wherein a semantic unit index points to another mode 
of data. 

9. The method of claim 1, further comprising the step
of identifying the type of textual data, wherein the step of
transcribing is performed based on the type of textual data 
identified. 

10. The method of claim 1, further comprising the step 
of converting the index into a universal index which
cross-references characters of different fonts.

11. The method of claim 1, further comprising the step
of searching the textual database for target textual data 
using the semantic unit index. 

12. The method of claim 11, further comprising the 
step of converting a target word into a string of semantic 
units to perform the searching step. 

13. The method of claim 12, wherein the step of 
converting a target word is performed automatically using a 
character-to-semantic unit mapping table. 

14. The method of claim 11, further comprising the 
step of displaying search results, wherein the target 
textual data is displayed starting from a corresponding 
semantic unit in a user query and commencing one of forward 
and backward for a given length based on a user request. 

15. A program storage device readable by a machine, 
tangibly embodying a program of instructions executable by 
the machine to perform method steps for managing a textual 
database, the method comprising the steps of: 

transcribing textual data into corresponding semantic 
units;

storing the textual data in the textual database; and

generating an index based on semantic units for
indexing the stored textual data with the corresponding
semantic units.

16. A system for managing a textual database,
comprising:

a recognition system for transcribing textual data into
corresponding semantic units;

a textual database for storing the textual data; and

an index generator adapted to generate an index based
on semantic units, wherein the textual data stored in the
textual database is indexed with the corresponding semantic
units.

17. The system of claim 16, wherein the recognition 
system comprises an OCR (optical character recognition)
system and an AHR (automatic handwriting recognition) system
for transcribing typed text and handwritten text,
respectively. 

18. The system of claim 16, wherein the recognition 
system comprises a language model based on semantic units.

19. The system of claim 16, further comprising an
index converter adapted to convert the index into a
universal index which cross-references characters of
different fonts for a given language.

20. The system of claim 16, further comprising:

a query processor adapted to transform a search query 
for target textual data into corresponding semantic units; 
and 

a search engine for searching the textual database
based on the semantic units corresponding to the search
query.

21. The system of claim 20, further comprising an 
automatic word boundary marking system that is applied to a 
search query. 
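
For illustration only, the flow recited in claims 1 and 11 to 13 (transcribing, storing, indexing, then searching by converting a target word into a string of semantic units) might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the mapping table UNIT_TABLE, its placeholder characters and syllables, and the names transcribe and TextualDatabase are all hypothetical and do not appear in the claims.

```python
from collections import defaultdict

# Hypothetical character-to-semantic-unit mapping table (cf. claim 13).
# Placeholder characters "A" and "B" share one unit, so a query written
# with either character matches text written with the other.
UNIT_TABLE = {"A": "ma", "B": "ma", "C": "shu", "D": "ku"}


def transcribe(text):
    """Map each character to its semantic unit; unmapped characters map to themselves."""
    return [UNIT_TABLE.get(ch, ch) for ch in text]


class TextualDatabase:
    def __init__(self):
        self.documents = []             # stored textual data
        self.units = []                 # semantic-unit transcription of each document
        self.index = defaultdict(list)  # semantic unit -> [(doc_id, position), ...]

    def add(self, text):
        """Transcribe, store, and index one piece of textual data."""
        doc_id = len(self.documents)
        doc_units = transcribe(text)
        self.documents.append(text)
        self.units.append(doc_units)
        for pos, unit in enumerate(doc_units):
            self.index[unit].append((doc_id, pos))
        return doc_id

    def search(self, target):
        """Convert the target word to semantic units and look it up via the index."""
        query = transcribe(target)
        if not query:
            return []
        hits = []
        # Candidates are the positions of the first query unit; each candidate
        # is then checked against the full unit string of the query.
        for doc_id, pos in self.index.get(query[0], []):
            if self.units[doc_id][pos:pos + len(query)] == query:
                hits.append((doc_id, pos))
        return hits


db = TextualDatabase()
db.add("ACD")
db.add("DBC")
matches = db.search("BC")  # matches both documents, since "A" and "B" share a unit
```

Each hit is a (document, position) pair, so a display step such as that of claim 14 could start from the matched unit and extend forward or backward from that position.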